Abstract. Map digitization is one of the most important means of
geographic data acquisition. Automatic extraction of geographic features
from scanned maps is the key technique for improving the efficiency of
map digitization. Linear feature extraction from color topographic map
images is usually performed by interactive or automatic line tracking on
color-separated binary images. But for color maps with heavy
interconnectedness of geographic features and complex backgrounds such as
vegetation tints and relief shadings, the results of color image
segmentation cannot meet the demands of automatic extraction, and
time-consuming manual tracking is inevitable. This paper presents a new
semi-automatic method to extract linear features directly from original
scanned color maps without color layer separation. In the proposed
method, a sliding window is placed on a user-specified linear feature,
and the current line in the window is first segmented adaptively by using
color space conversion, k-means clustering and directional region
growing. After that, a thinning operation is performed and the line in
the window is tracked from the centre to the edge. By moving the window
continuously along the line, the operations of image segmentation,
thinning and line tracking are repeated iteratively, so that the
specified line is tracked automatically. A small amount of manual
processing is incorporated in case automatic tracking fails. The
performance of the proposed method is tested on a number of color map
samples with complex backgrounds.
1. Introduction



Map digitization is one of the most important means of geographic data
acquisition for Geographic Information System (GIS) applications. In the
past decades, map digitization has progressed from manual tracking on
digitizing tablets to heads-up screen digitizing. Today, human-machine
cooperative map digitization based on image processing and pattern
recognition techniques is increasingly used.

Many commercial software systems such as VPStudio
(http://www.softelec.com/), RxAutoImage (http://www.rasterex.com/), R2V
(http://www.ablesw.com/) and MapGIS (http://www.mapgis.com.cn/) are
available for map digitization. Linear features in high quality
black-and-white map images can be extracted automatically by using line
tracking methods. For color topographic maps with heavy
interconnectedness of geographic features, however, automatic extraction
is still a very difficult challenge.

The existing methods of linear feature extraction from color topographic
maps can be grouped into two main categories: color segmentation based
methods and original map based methods. In a color segmentation based
method, a color map image is first separated into several layers with
predefined colors, and then linear features are extracted from the
separated binary layers interactively or automatically. This kind of
method can significantly reduce the complexity of cartographic features
and make the extraction work relatively easier. Here, color image
segmentation is a fundamental and critical step. The accuracy and speed
of linear feature extraction depend on the quality of the segmented
results. Many algorithms for color map image segmentation have been
developed (Khotanzad & Zink 2003, Ablameyko et al. 2003, Pezeshk &
Tutwiler 2008). Nevertheless, these algorithms are far from satisfactory
for segmenting a color map image into the desired color layers, owing to
mixed color pixels and aliasing induced by the scanning process,
especially for maps with complex backgrounds such as vegetation tints and
relief shadings. Noise, gaps and adhesions exist everywhere in the
separated layers, particularly in the contour line layer. Although some
algorithms for removing noise and reconnecting broken lines have been
developed (Cheng et al. 2003, Zeng et al. 2004), a great deal of human
editing is inevitable to obtain an image of high enough quality for
automatic extraction.

Another way to extract linear features is based on original scanned color
maps. A linear feature is tracked automatically starting from a
user-specified point, and some flexible user interventions are allowed in
cases where automatic tracking fails. The strategy of human-machine
cooperation is utilized here, which keeps the line tracking process under
human control and provides the ability to correct data immediately if
required. Therefore, it is more practical and should be preferred for
color map images with poor quality and complex backgrounds.

To the best of our knowledge, little research has been done on extracting
linear features directly from original scanned color maps. The rule of
minimum color distance is adopted in the methods proposed by Wu & Wang
(1998) and Huang et al. (2005) to determine the next point in the line
tracking process. From the point of view of applications, such methods
have two main disadvantages in performance:

(1) The line tracking process depends greatly on the user-specified starting 
point. The user needs to magnify the image and select the midpoint of a line 
exactly as the starting point. If the selected point has a little deviation, the 
following line tracking cannot be ensured to be along the center line. 

(2) Automatic line tracking often fails when meeting other cartographic 
features with similar color. Linear features can hardly be tracked on those 
maps with vegetation tints and relief shadings. 

In order to overcome the above shortcomings, this paper presents a new 
method to extract linear features directly from color topographic map 
images. It is implemented by using adaptive image segmentation and se- 
quential line tracking based on sliding window. 



2. The Analysis of Color Topographic Map Images 

Topographic maps typically use only a few distinct colors to represent
different cartographic feature layers, for example, black cultural
features, brown geomorphologic features, blue drainage features, and
green vegetation features. Because of the degradation of the paper map,
the interconnectedness of cartographic features, and RGB misalignment in
the scanning process, large numbers of scattered colors and noise are
generated in a scanned map image. As a result, features in the same layer
do not have the same color, and similar colors do not necessarily belong
to the same feature layer, which makes image segmentation based on color
information very difficult.

Figure 1 shows a part of a color topographic map with relief shadings and
the color segmentation result of the contour line layer. We can see that
shading areas adhere together and many broken lines occur in the
segmented layer. Automatic linear feature extraction cannot be performed
on an image of such low quality.




Figure 1. A part of a color topographic map with relief shadings. (a) Original
image. (b) Color segmentation result of contour line layer.



3. The proposed approach 

In order to extract linear features directly from original color map images 
without color segmentation, it is necessary to distinguish linear features 
adaptively from complex background. In a color topographic map image, 
different regions usually show marked differences in color, brightness and 
contrast, especially those regions with vegetation tints and relief shadings. 
So it is difficult to distinguish linear features from the background using a 
global method. Considering this, we propose a local adaptive segmentation 
method based on sliding window to separate a specified linear feature from 
background, followed by a sequential line tracking to extract the line. 

Figure 2 shows the procedure of linear feature extraction. For a specified
linear feature, a starting point and initial direction are first input by
the operator, and a predefined rectangular window (called the sliding
window) is placed on the line. Then, adaptive image segmentation,
thinning, and line tracking are performed in the window. By moving the
window continuously along the line and repeating the above operations,
the line is tracked sequentially until an endpoint or an intersection is
met. If the tracking is broken or a tracking error occurs, manual
operation is necessary to cross the intersection or move back to the
correct point. After that, the sequential line tracking continues until
the whole line is extracted.



[Flowchart; recoverable steps: input a starting point and direction,
sliding window creation, adaptive image segmentation, thinning]

Figure 2. The procedure of linear feature extraction from scanned color maps.

3.1. Adaptive image segmentation based on sliding window 

The proposed image segmentation approach based on sliding window is 
performed by using color space conversion, k-means clustering and direc- 
tional region growing. 

Color space conversion 

The color image in the sliding window is first converted into a 256-level
grey-scale image so as to reduce the complexity of the problem. This is
due to the following considerations. Firstly, color confusion is reduced
in an image with a limited number of grey levels. Secondly, it is
relatively easy to separate objects in the grey-scale image because there
is a marked contrast between foreground and background.

The YIQ color space is adopted here to separate grey-scale information
from color data. In this color space, image data consists of three
components: luminance (Y), hue (I), and saturation (Q). The first
component represents grey-scale information, while the last two
components represent color information. By converting an RGB image into
YIQ format, the grey-scale information can be extracted without loss
(Chen et al. 2000).

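The conversion from RGB to YIQ (Ford & Roberts 1998) can be expressed
as:

$$
\begin{bmatrix} Y \\ I \\ Q \end{bmatrix} =
\begin{bmatrix}
0.299 & 0.587 & 0.114 \\
0.596 & -0.274 & -0.322 \\
0.212 & -0.523 & 0.311
\end{bmatrix}
\begin{bmatrix} R \\ G \\ B \end{bmatrix}
$$

where the Y component is the grey-scale value converted from the RGB value.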
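As a minimal sketch of this conversion step, the luminance component can be computed per pixel for the image in a sliding window. The function name `rgb_to_grey`, the nested-list image layout, and the use of the standard NTSC luminance weights (0.299, 0.587, 0.114) are assumptions made for illustration, not details confirmed by the paper.

```python
def rgb_to_grey(window):
    """Convert a window of (R, G, B) tuples (0-255) to 256-level grey-scale.

    Only the luminance (Y) component of the RGB-to-YIQ conversion is kept.
    The weights are the standard NTSC values (an assumption of this sketch).
    """
    return [
        [int(round(0.299 * r + 0.587 * g + 0.114 * b)) for (r, g, b) in row]
        for row in window
    ]


if __name__ == "__main__":
    # A tiny 1 x 3 "window": white, black and pure red pixels.
    print(rgb_to_grey([[(255, 255, 255), (0, 0, 0), (255, 0, 0)]]))
```

Only the Y component is needed for the later clustering steps, so the I and Q components are not computed here.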

Figures 3(a) and 3(b) show the color image and the converted grey-scale
image in a 25 × 25 sliding window of Figure 1(a).

Figure 3. The image in a 25 × 25 sliding window. (a) Color image. (b) Grey-scale
image. (c) Segmentation by thresholding.

In an image with low contrast and low signal-to-noise ratio (SNR), it is
difficult to separate the object from the background by using traditional
image segmentation methods such as thresholding (see Figure 3(c)).
Therefore, a new algorithm combining k-means clustering with directional
region growing is presented next to segment the specified linear feature
in the sliding window.

K-means clustering 

K-means clustering is one of the simplest unsupervised learning algorithms
for solving the clustering problem (Wagstaff et al. 2001). In our
algorithm, it is applied to a small neighbourhood in the centre of the
sliding window to divide the pixels into object and background regions.
For a 300 dpi topographic map image, a 5 × 5 neighbourhood is selected
considering the line width and the interval between two lines.



The process of k-means clustering is as follows.

Step 1: Choose a seed point with minimum grey-scale from the 5 × 5
neighbourhood of the centre of the sliding window.

Step 2: Create the initial clustering centres $c_1$ and $c_2$ of the object
and background regions by finding the minimum and maximum grey-scale
values, respectively, in the 5 × 5 neighbourhood of the seed pixel.

Step 3: Compute $d_1 = |g - c_1|$ and $d_2 = |g - c_2|$ for each pixel with
grey-scale $g$ in the 5 × 5 neighbourhood. If $d_1 < d_2$, then the pixel
belongs to the object region, and to the background region otherwise.

Step 4: Compute the average grey-scales $m_1$ and $m_2$ of the object and
background regions, respectively. If they have converged to $c_1$ and
$c_2$, go to Step 5; otherwise, assign $m_1$ and $m_2$ to the clustering
centres $c_1$ and $c_2$, and go to Step 3.

Step 5: Set the grey-scale values of all pixels belonging to the target
region within the 5 × 5 neighbourhood to 1, and to 0 otherwise.
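The clustering loop above can be sketched in a few lines. This is an illustrative sketch only: the function name `kmeans_two_class`, the flat-list input, and the iteration cap are hypothetical, and it assumes that the line (object) is the darker class, so the object centre starts at the minimum grey-scale and the background centre at the maximum.

```python
def kmeans_two_class(values, max_iter=100):
    """Two-class k-means on the grey-scale values of a small neighbourhood.

    Returns (labels, c1, c2), where label 1 marks the object (assumed to be
    the darker class) and label 0 marks the background.
    """
    # Step 2: initial centres from the extreme grey-scale values.
    c1, c2 = float(min(values)), float(max(values))
    labels = [0] * len(values)
    for _ in range(max_iter):
        # Step 3: assign each pixel to the nearer centre.
        labels = [1 if abs(g - c1) < abs(g - c2) else 0 for g in values]
        obj = [g for g, lab in zip(values, labels) if lab == 1]
        bg = [g for g, lab in zip(values, labels) if lab == 0]
        # Step 4: recompute the class means; keep a centre if its class is empty.
        m1 = sum(obj) / len(obj) if obj else c1
        m2 = sum(bg) / len(bg) if bg else c2
        if m1 == c1 and m2 == c2:
            break  # converged
        c1, c2 = m1, m2
    # Step 5: the labels are already the binary (1 / 0) result.
    return labels, c1, c2


if __name__ == "__main__":
    print(kmeans_two_class([100, 105, 110, 200, 210, 205]))
```

In practice the 25 values of the 5 × 5 neighbourhood would be passed in row-major order, and the returned labels reshaped back into a 5 × 5 binary matrix.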

For the grey-scale matrix of Figure 3(b), k-means clustering is performed in
the 5 × 5 neighbourhood of the seed pixel (whose grey-scale value is 105).
Figure 4 shows the object and background regions after k-means clustering.

Figure 4. K-means clustering within a 5 × 5 neighbourhood. (a) Grey-scale matrix.
(b) Clustering result.

It should be noted that the small neighbourhood of k-means clustering can 
be adjusted with the change of line width and the resolution of map image. 

Directional region growing 

Based on the k-means clustering result, the object region is expanded to the
whole sliding window by using the proposed directional region growing
algorithm. Before giving the algorithm, the initial direction given by the
operator is quantized into one of eight discrete directions
$d_1, d_2, \ldots, d_8$ (see Figure 5(a)), and the four sides of a 5 × 5
neighbourhood centred at a seed pixel are defined as $L_1$, $L_3$, $L_5$
and $L_7$ (see Figure 5(b)).

Figure 5. (a) Eight discrete directions. (b) The four sides of a 5 × 5 neighbourhood.

The directional region growing is described as follows.

Step 1: Initialization.

① Find the pixel with minimum original grey-scale value along $L_i$
($i = 1, 3, 5, 7$) within the target region as a new seed if the initial
direction is $d_i$ ($i = 1, 3, 5, 7$).

② Find the pixel with minimum original grey-scale value along $L_{i-1}$ and
$L_{i+1}$ ($i = 2, 4, 6, 8$; $L_{8+1} = L_1$) within the target region as a
new seed if the initial direction is $d_i$ ($i = 2, 4, 6, 8$).

Step 2: Perform k-means clustering within the 5 × 5 neighbourhood of the
seed, obtaining new clustering centres $c_1$ and $c_2$.

Step 3: Find the pixels in the 5 × 5 neighbourhood meeting the following two
conditions. Add them to the object region, and set their grey-scale values
to 1.

Condition 1: The grey-scale difference between the pixel and the object
clustering centre is smaller than that between the pixel and the background
clustering centre, i.e., $|g - c_1| < |g - c_2|$, where $g$ is the
grey-scale of the pixel.

Condition 2: There is at least one pixel with binary value 1 in its
8-neighbourhood.

Step 4: Find the pixel with minimum original grey-scale value among those
pixels belonging to the newly grown target region at the four sides of the
5 × 5 neighbourhood as a new seed.

Step 5: Repeat Steps 2 to 4 until the edge of the sliding window is reached.

Step 6: Set the grey-scale values of pixels belonging to the background
region to 0.
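The growth rule of Step 3 can be illustrated with a small sketch. It covers only the two conditions inside one neighbourhood, not the seed selection along the sides or the window-level loop. The function name `grow_in_neighbourhood`, the nested-list layout, and the choice to repeat the scan until no pixel changes are assumptions of this sketch rather than details stated in the text.

```python
def grow_in_neighbourhood(grey, mask, seed, obj_centre, bg_centre, half=2):
    """Grow the object region inside the (2*half+1)^2 neighbourhood of `seed`.

    grey: 2D list of original grey-scale values.
    mask: 2D list of 0/1 object labels, updated in place.
    Returns the list of pixels newly added to the object region.
    """
    rows, cols = len(grey), len(grey[0])
    r0, c0 = seed
    added = []
    changed = True
    while changed:  # assumption: rescan until the region stops growing
        changed = False
        for r in range(max(0, r0 - half), min(rows, r0 + half + 1)):
            for c in range(max(0, c0 - half), min(cols, c0 + half + 1)):
                if mask[r][c] == 1:
                    continue
                g = grey[r][c]
                # Condition 1: closer to the object centre than to the background.
                if not abs(g - obj_centre) < abs(g - bg_centre):
                    continue
                # Condition 2: at least one object pixel in the 8-neighbourhood.
                touches_object = any(
                    mask[rr][cc] == 1
                    for rr in range(max(0, r - 1), min(rows, r + 2))
                    for cc in range(max(0, c - 1), min(cols, c + 2))
                    if (rr, cc) != (r, c)
                )
                if touches_object:
                    mask[r][c] = 1
                    added.append((r, c))
                    changed = True
    return added


if __name__ == "__main__":
    grey = [[200, 200, 100, 200, 200] for _ in range(5)]  # dark vertical line
    mask = [[0] * 5 for _ in range(5)]
    mask[2][2] = 1  # seed pixel already labelled as object
    print(sorted(grow_in_neighbourhood(grey, mask, (2, 2), 100, 200)))
```

Condition 2 is what keeps an isolated dark pixel (for example a speck of shading) out of the object region even when its grey-scale passes Condition 1.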

After the above steps, the segmentation result in the sliding window is
obtained (see Figure 6).

Figure 6. K-means clustering and directional region growing in the sliding
window.

In the algorithm, k-means clustering and directional region growing are
performed automatically in the sliding window to separate the object from
the background regardless of how the brightness and contrast change; it is
therefore an adaptive segmentation algorithm.

3.2. Sequential line tracking 

After image segmentation, a thinning operation is performed, followed by a 
line tracking process in the sliding window. By moving the window contin- 
uously along the line and doing the above operations iteratively, the line is 
tracked sequentially until an endpoint or an intersection is met. 

Before giving the algorithm, we first introduce several terms in binary 
thinned images (see Figure 7). 

Endpoint: The point with one black pixel in the 3 × 3 neighbourhood.

Connecting point: The point with two black pixels in the 3 × 3
neighbourhood.

Crossing point: The point with more than three black pixels in the 3 × 3
neighbourhood.

Figure 7. (a) Endpoint. (b) Connecting point. (c) Crossing point.
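These definitions amount to counting the black pixels among the eight neighbours of a point in the thinned image. The sketch below is illustrative: the names `count_black_neighbours` and `classify_point` and the encoding of black pixels as 1 are hypothetical. The definitions above do not cover the case of exactly three black neighbours; the sketch treats it as a crossing point, which is an assumption.

```python
def count_black_neighbours(img, r, c):
    """Count black pixels (value 1) among the 8 neighbours of (r, c)."""
    rows, cols = len(img), len(img[0])
    return sum(
        img[rr][cc]
        for rr in range(max(0, r - 1), min(rows, r + 2))
        for cc in range(max(0, c - 1), min(cols, c + 2))
        if (rr, cc) != (r, c)
    )


def classify_point(img, r, c):
    """Classify a black pixel of a thinned binary image by its neighbour count."""
    n = count_black_neighbours(img, r, c)
    if n == 0:
        return "isolated"
    if n == 1:
        return "endpoint"
    if n == 2:
        return "connecting"
    return "crossing"  # assumption: three or more neighbours counts as crossing


if __name__ == "__main__":
    line = [[0, 0, 0],
            [1, 1, 1],
            [0, 0, 0]]
    print(classify_point(line, 1, 0), classify_point(line, 1, 1))
```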

The sequential line tracking algorithm is described as follows. 

Step 1: Find a connecting point $P_0$ as the starting point in a 3 × 3
neighbourhood in the centre area of the sliding window.

Step 2: From the two black pixels in the 3 × 3 neighbourhood of $P_0$,
select the one whose direction $d$ is closest to the current direction
(the initial direction is input by a human operator).

Step 3: Set the current tracking point to white, and track to the next
black point $P_i$ in direction $d$ ($i$ is counted from 1).

Step 4: Classify $P_i$ according to its 3 × 3 neighbourhood. There are three
cases:

① If $P_i$ is a connecting point, only one black pixel remains in its 3 × 3
neighbourhood apart from the tracked point; use its direction as the
tracking direction $d$ and go to Step 3.

② If $P_i$ is a crossing point, judge whether it is a true or a pseudo
crossing point (discussed below). If it is true, stop tracking; otherwise,
determine the forward direction $d$ and go to Step 3.

③ If $P_i$ is an endpoint or a point on the side of the sliding window, go
to Step 5.

Step 5: Count the number of tracked points. If it is less than 3, stop
tracking; otherwise, move the centre of the sliding window to the current
point, and go to Step 1.
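A simplified sketch of the tracking inside one window (Steps 2 to 4) is given below. It is not the authors' implementation: the function name `track_in_window`, the return values, and the use of the cosine between step vectors to pick the closest direction are assumptions, and crossing points are only reported, not resolved into true or pseudo ones.

```python
import math


def track_in_window(thin, start, direction):
    """Follow a thinned line (black = 1) from `start` along `direction`.

    thin is modified in place: tracked points are set to white (0).
    Returns (path, reason) with reason in {"endpoint", "crossing", "edge"}.
    """
    rows, cols = len(thin), len(thin[0])
    (r, c), d = start, direction
    path = [start]
    while True:
        thin[r][c] = 0  # Step 3: mark the current tracking point as white
        nbrs = [
            (rr, cc)
            for rr in range(max(0, r - 1), min(rows, r + 2))
            for cc in range(max(0, c - 1), min(cols, c + 2))
            if thin[rr][cc] == 1
        ]
        if not nbrs:
            return path, "endpoint"
        # After the start point, more than one remaining neighbour is a crossing.
        if len(nbrs) > 1 and len(path) > 1:
            return path, "crossing"

        def cosine(p):
            dr, dc = p[0] - r, p[1] - c
            return (dr * d[0] + dc * d[1]) / (math.hypot(dr, dc) * math.hypot(*d))

        nxt = max(nbrs, key=cosine)  # neighbour closest to the current direction
        d = (nxt[0] - r, nxt[1] - c)
        r, c = nxt
        path.append(nxt)
        if r in (0, rows - 1) or c in (0, cols - 1):
            return path, "edge"  # reached a side of the sliding window


if __name__ == "__main__":
    window = [[0] * 5 for _ in range(5)]
    window[2] = [1] * 5  # horizontal line through the window centre
    print(track_in_window(window, (2, 2), (0, 1)))
```

A caller would then apply Step 5: if the returned path has fewer than three points, stop; otherwise re-centre the sliding window on the last point of the path and repeat.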

In the above process, crossing points may be encountered due to
intersections between different linear features, or due to noise. The
former is called a true crossing point while the latter is called a pseudo
crossing point. When a true crossing point is met, automatic line tracking
pauses, and the next point is input manually. After that, the line
tracking continues as before. When a pseudo crossing point is met, an
automatic judgment of the forward direction is needed to pass across the
pseudo crossing point.

To handle the pseudo crossing point, a concentration degree is defined as
the total number of black pixels within a limited region. Suppose that S
is the point in the un-thinned binary image corresponding to the crossing
point P. If the concentration degree around S is larger than a preset
value, then P is treated as a true crossing point. Otherwise it is a
pseudo one. For topographic maps with a resolution of 300 dpi, the
threshold on the concentration degree can be set to 20 within a 5 × 5
region. When a pseudo crossing point is met, a trial tracking is done to
determine the next tracking direction. As shown in Figure 8, there are two
forward directions $d_1$ and $d_2$ at the pseudo crossing point P. First,
three connecting pixels along $d_1$ and along $d_2$ are tracked, and their
corresponding grey-scale values in the grey-scale image are recorded.
Then, the average grey-scale values are calculated for $d_1$ and $d_2$,
respectively. If the former is smaller, $d_1$ is chosen as the next
tracking direction; otherwise $d_2$ is chosen.
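The two decisions described above can be sketched as follows. The function names, the nested-list image layout, and the representation of each trial path as a list of pixel coordinates are hypothetical; the threshold of 20 within a 5 × 5 region is the value the text gives for 300 dpi maps.

```python
def concentration_degree(binary, r, c, half=2):
    """Number of black pixels (value 1) in the (2*half+1)^2 region around (r, c)."""
    rows, cols = len(binary), len(binary[0])
    return sum(
        binary[rr][cc]
        for rr in range(max(0, r - half), min(rows, r + half + 1))
        for cc in range(max(0, c - half), min(cols, c + half + 1))
    )


def is_true_crossing(binary, r, c, threshold=20, half=2):
    """True if the un-thinned image is dense enough around the crossing point."""
    return concentration_degree(binary, r, c, half) > threshold


def choose_direction(grey, path1, path2):
    """Trial tracking: return 1 if the pixels along d1 are darker on average, else 2."""
    mean1 = sum(grey[r][c] for r, c in path1) / len(path1)
    mean2 = sum(grey[r][c] for r, c in path2) / len(path2)
    return 1 if mean1 < mean2 else 2


if __name__ == "__main__":
    blob = [[1] * 5 for _ in range(5)]
    print(is_true_crossing(blob, 2, 2))
    grey = [[200, 200, 200],
            [90, 100, 150],
            [200, 200, 200]]
    print(choose_direction(grey, [(1, 0)], [(1, 2)]))
```

Choosing the darker branch rests on the same observation used for seed selection, namely that the line being tracked is darker than a noise branch or the background.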




Figure 8. Determination of the forward direction at a pseudo crossing point. 

Figure 9 illustrates the continuous sliding window and sequential line
tracking from a starting point S with an initial direction indicated by
the arrow. Figure 10 shows the results of image segmentation, thinning and
line tracking in windows 1-6 of Figure 9(a). The grey line marked in each
window shows the tracking path. By connecting all the grey lines in order,
the tracking result is obtained (see Figure 9(b)).





Figure 9. Sequential line tracking. (a) Continuous sliding window. (b) The result
of line tracking.

Figure 10. (a) Images in windows 1-6 in Figure 9(a). (b) Corresponding
segmentation results of the current linear feature. (c) Corresponding results
of thinning and line tracking.



4. Experiments and analysis 

Experiments have been conducted to test our proposed method. Figure 11(a)
is a part of a topographic map with relief shadings. The size of the image
is 300 × 300 pixels, the resolution is 300 dpi, and the sliding window is
25 × 25 pixels. For each contour line, once a starting point and direction
are input by a human operator, it can be tracked automatically. If an
intersection or a gap is met, automatic tracking stops, and the next point
on the line is input manually. After that, the line tracking continues. If
the gap is wide, a few points should be collected manually to pass it.
Figure 11(b) is the extracted result of contour lines. Figure 12(a) is a
part of another topographic map with vegetation tints. The image size, the
scanning resolution, and the window size remain unchanged. Figure 12(b) is
the extracted result of contour lines. The average times for extracting
the contour lines in Figure 11 and Figure 12 were 200 seconds and 160
seconds, respectively, on a 3 GHz Pentium(R) 4 computer. Most of the time
was taken by manual input of starting points and some interventions during
the tracking process, while the time required by automatic tracking is
very short. A number of topographic map samples have been used to test our
algorithm, and satisfactory results have been achieved.




Figure 12. A part of another topographic map with vegetation tints. (a) Original
image. (b) Extracted result of contour lines.



A comparison with the commercial software MapGIS has been made. For color
maps with higher contrast between topographic features and background, it
took nearly the same time to extract linear features with the proposed
method and with MapGIS. But for map images with lower contrast and lower
SNR, our method is clearly more efficient: it took about 540 seconds to
extract the contour lines in Figure 11 by using MapGIS, against 200
seconds with the proposed method. Errors (as shown in the white circle
areas in Figure 13) often occur in the line tracking process, and more
human interventions are needed to handle these problems.

From the experiments, the proposed method demonstrates the following
advantages:

(1) Most of the work of line tracking can be finished automatically, while
human operators only need to give the starting point and the directional
point. Some manual interventions are allowed in case automatic tracking
fails, which keeps line tracking under human control and provides the
ability to correct data immediately if required.

(2) Linear features can be tracked accurately along the centerline after
image segmentation and thinning. This avoids the deviation from the
centerline that occurs when only color distance is used to track points.

(3) The sliding window is updated continuously, and the segmentation
result in each window depends on the grey-level distribution and spatial
relationships in the current window. This makes line tracking adapt to
color variations.



5. Conclusion 

This paper presents a new method to extract linear features directly from
scanned color topographic maps. The process of sliding window creation,
adaptive image segmentation, thinning and sequential tracking can be used
as general steps for linear feature extraction. Future improvement will
mainly focus on automatic tracking across intersections and gaps so as to
reduce human intervention and improve the efficiency of map digitization.